IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
BOARD OF PATENT APPEALS AND INTERFERENCES 



In re Application of

Kruelen, et al.

Serial No.: 09/848,430

Group Art Unit: 2176

Filed: May 4, 2001

Examiner: Ries, Laurie Anne



For: AN EFFICIENT STORAGE MECHANISM FOR REPRESENTING TERM 
OCCURRENCE IN UNSTRUCTURED TEXT DOCUMENTS 

Commissioner of Patents 
Alexandria, VA 22313-1450



Appellants respectfully respond to the arguments in the Examiner's Answer mailed 
on February 27, 2006, as focused particularly on section (10) Response to Argument, 
beginning on page 11 of the Answer.

In general, Appellants submit that the rejection, as revised to recognize that claims 
3, 4, 7, 8, 11, and 12 are now allowable if rewritten in independent format, clearly
demonstrates the insidious nature of improper hindsight. Appellants submit that the 
rejection merely uses the techniques taught in the disclosure of the present invention to 
adapt the technique taught in primary reference US Patent 5,895,470 to Pirolli et al., to
evaluate a website so that it would be executed in accordance with the concepts of the 
present invention. 

Appellants submit that such adaptation merely uses the claimed invention as a 
roadmap, given the distinct nature of the techniques taught in Pirolli and the very different 
nature of a website from that of a document corpus containing an ordered plurality of 
Appellants' Response to Examiner's Answer
S/N: 09/848,430 

documents. Appellants also submit that such adaptations would inherently change the 
underlying principles of operation described in Pirolli, which change of principle of 
operation would be improper in an obviousness evaluation. 

1. A website is not normally considered from the perspective of a document corpus 

Appellants first submit that, to even consider Pirolli as the primary reference, one 
having ordinary skill in the art would have to view evaluation of a website of linked 
documents in an entirely abnormal perspective as being a precisely-ordered arrangement of 
documents. Appellants submit that even Pirolli, at lines 27-28 of column 6 ("Thus, the
walker produces a graph representation of the hyperlink structure of the Web locality"), 
does not consider a website to be precisely-ordered. 

Thus, even according to the primary reference itself, the environment of the 
primary reference would first have to be converted into the perspective of being a 
precisely-ordered listing of documents in order to satisfy the plain meaning of the claim 
language of the independent claims. Appellants submit that the Examiner's argument that 
there is a home page that can serve as a table of contents does not in any manner convert 
the perspective of the website into a precisely-ordered listing of documents, when the 
primary reference itself clearly describes its method as based upon the perspective of a 
graphical hyperlink structure. 

The most that can reasonably be said is that the Examiner's Answer points out that 
one could change the perspective of a website to become a precisely-ordered listing of 
documents, even though it typically is viewed by one having ordinary skill in the art from 
the perspective of a graphical hyperlink structure. 

In contrast, the present invention is directed specifically to a document corpus that 
presumes initially that the documents are precisely-ordered. Thus, the present invention 
does not include the initial step of imposing an arbitrary ordering of the documents 
typically viewed as interconnected in a graphically represented hyperlink structure, thereby 
initially converting the website into a document corpus. 

Therefore, Appellants submit that the imposing of an order based upon an arbitrary
standard, such as considering the documents to be arranged in accordance with a table of
contents of a homepage, does not convert a website into a document corpus of
precisely-ordered documents, absent improper hindsight.

2. There would be no reason to modify the standard perspective of a website, as 
demonstrated by the primary reference, into the perspective of a document corpus, since 
the primary reference addresses the evaluation of similarities of two documents on the 
website 

Appellants secondly submit that one having ordinary skill in the art would have no 
reasonable motivation to modify the technique taught in primary reference Pirolli into a 
technique based upon viewing a website as a document corpus. As clearly shown in 
Figure 11, the technique in Pirolli is directed to finding a similarity between the different
documents on the website, exemplarily demonstrated as a matrix in which similarities of 
two documents are listed. Thus, it is clear that the analysis technique in Pirolli is based on 
a document-by-document analysis. There is clearly no need, in a technique in which one
document of a website is compared with a second document on the website, to view the
website as an entity to be represented as an entire database represented by a single vector,
as if it were an ordered set of documents.

In contrast, the present invention recognizes that, given a document corpus of 
ordered documents, an input query can be more efficiently exercised if all these ordered 
documents were represented as a single entity in the format of a single data vector. The present
inventors have adopted an entirely different perspective of a database, by considering the 
benefit of viewing the database in its entirety. 

Therefore, given the entirely different purpose of the primary reference (e.g., to
determine similarity between two documents of a website, the similarity being represented
by a matrix having documents as reference nodes of the matrix coordinate axes, as shown
in Figure 11 of Pirolli) from the purpose of the present invention (e.g., to represent
information content of a plurality of documents in a single-vector format, possibly then
available for matching with an input query), Appellants submit that one of ordinary skill in
the art would not have been motivated to adopt an unconventional perspective that a
website could be viewed as a document corpus, rather than a plurality of linked documents
represented graphically as having no specific ordering. Moreover, the task of comparing
similarity between two documents does not benefit from this different perspective.

It is the present invention that teaches the concept of a single data vector for 
information of a database. The Examiner's evaluation is clearly influenced by improper 
hindsight. 

3. Converting the technique of determining similarity of two documents within a 
website into a technique based upon representing data in a document corpus as a single 
vector of information would improperly change the principle of operation of the primary
reference.

As mentioned above, the technique in Pirolli ultimately produces a matrix of 
normalized similarity between any two documents in the website, as shown in Figure 11. 
Clearly, representing data of the website documents as buried in a single vector of 
information will only provide an undue burden on the technique in Pirolli that inherently 
depends upon evaluating two separate documents in isolation of other documents, 
particularly since the comparison further includes derivation of data that is outside the 
textual information content of the documents. 

That is, as clearly described in lines 63-64 of column 4 ("The raw data is comprised 
of topology information, page meta-information, page frequency path information and text 
similarity information."), the raw data used in Pirolli includes more than the text data on 
the web documents. Appellants submit that it would be very difficult for one of ordinary 
skill in the art even to convert this raw data of Pirolli into a format that is conducive to the 
representation of the present invention involving a single vector for the entirety of the 
website, since more than text data is involved. 

In contrast, the present invention is directed toward a document corpus having
exemplarily only text data, which information is not interspersed with linkage information,
meta-information, and usage data, such as present in Pirolli, that would have to be
identified as specific types of data in the single vector representation.

There is certainly no suggestion in Pirolli for such representation, and
Appellants submit that one having ordinary skill in the art would not be motivated to
attempt to represent raw data of Pirolli in the single vector of the present invention because
of the burden that would be imposed in attempting to use this single-vector format in the
evaluation method of Pirolli wherein two documents are to be compared for similarity.

The closest that the technique of Pirolli comes to the purpose of the present 
invention would be the task of determining similarity of text, as described in lines 31-48 of
column 7. The Examiner has not even attempted to define or discuss this method or to 
accordingly modify it to be executed in the manner of using a single vector of text data for 
the entire website. Appellants submit that, unless the Board wishes to explain this missing 
evaluation, the rejection currently of record is deficient for this reason alone. 

4. The combination of primary reference Pirolli and secondary reference Call is 
improper since these two references are non-analogous 

The Examiner continues to consider that Pirolli and Call are analogous because 
each is "... from the same field of endeavor of extracting and analyzing information from 
electronic documents." 

Appellants submit that this characterization demonstrates improper hindsight by 
attempting to consider references as analogous simply because there is some level of 
abstraction that can be articulated as providing a similarity. Even the separate USPTO 
classification of these two references clearly demonstrates that they are not analogous. 

Primary reference Pirolli involves determining similarity between documents at a 
website. In contrast, secondary reference Call involves representation of text data within a
single document. Appellants submit that these are two distinct environments having two
distinct purposes and clearly, therefore, are non-analogous. 

5. Even if Call were to be combined with Pirolli, the result would not satisfy the 
plain meaning of the independent claims.

The Examiner considers that Call teaches "... developing an uninterrupted array of
integers corresponding to an occurrence of terms."

In response, Appellants submit that such an array in Call is relative to a single
document, not a single vector representing text data of an entire document corpus, wherein
the boundaries of the documents are ignored in the data representation. That is, there is no
suggestion in Call to "think outside the boundaries of a single document", as the present
inventors have done.

Moreover, as explained above, the raw data in Pirolli includes more than the text 
data of the website documents, thereby additionally indicating that the current evaluation 
merely picks and chooses from entirely different environments, different purposes, and 
different techniques because of improper hindsight. 

Appellants submit that one of ordinary skill in the art would not find it obvious, 
absent improper hindsight, to use the listing of integers described for a single document, as
expanded to represent the contents of an entire database containing thousands or millions of
individual documents. Moreover, Appellants submit that one would have to go outside of 
the concepts of secondary reference Call to use it for the raw data of Pirolli that contains 
data additional to simple text data. 

Because neither primary reference Pirolli nor secondary reference Call suggests 
using a single vector to represent information for an entire database, even if these two
references were to be combined, the result would not satisfy the plain meaning of the 
language of the independent claims relative to a single vector of data for the information 
content of an entire document corpus.

As pointed out above, the inventors have taken a change of perspective in order to 
get outside the mindset of thinking of a document as an isolated entity by considering, 
instead, that a document corpus can be viewed as a database in its entirety for the purpose 
of representation of data. None of the prior art of record contains any suggestion of this
approach to representing data.

6. Relative to the Examiner's argument on page 13 that normalization between two 
documents would render obvious normalization within a document, Appellants submit that 
such evaluation is tantamount to suggesting that "normalization" is considered to be a 
concept in the abstract that no longer can serve as an element in any combination of 
elements to define an invention distinct from any other "normalization" process, whether 
the normalization involves a completely different method or involves a completely
different entity or environment. 

Thus, this aspect of the current evaluation becomes an example by the USPTO of 
attempting non-statutory subject matter in reverse, wherein the term "normalization" is 
applied as an abstract idea that is no longer available as an element for defining a 
patentable combination. 

Appellants submit that such abstraction of terminology is clearly improper 
hindsight. 

CONCLUSION 

In view of the foregoing, Appellants submit that claims 1-25, all the claims 
presently pending in the application, are clearly enabled and patentably distinct from the 
prior art of record and in condition for allowance. Thus, the Board is respectfully 
requested to remove all rejections of claims 1-25. 

Please charge any deficiencies and/or credit any overpayments necessary to enter 
this paper to Assignee's Deposit Account number 09-0441.